Different automated theorem provers reason in various deductive systems and, thus, produce proof
objects which are in general not compatible. To understand and analyze these objects, one needs
to study the corresponding proof theory, and then study the language used to represent proofs, on a
prover-by-prover basis. In this work we present an implementation that takes SMT and Connection
proof objects from two different provers and imports them both as expansion trees. By representing
the proofs in the same framework, all the algorithms and tools available for expansion trees
(compression, visualization, sequent calculus proof construction, proof checking, etc.) can be employed
uniformly. The expansion proofs can also be used as a validation tool for the proof objects produced.

1 Introduction 

Proof theory has developed a wide variety of proof formalisms. Natural deduction, sequent calculus,
resolution, tableaux and SAT are only a few of them, and even within the same formalism there may be
many variations. As a result, automated theorem provers generate different proof objects, usually
corresponding to their internal proof representation. The use of distinct formats has some
disadvantages: provers cannot recognize each other's proofs; proofs cannot be easily compared; and
all analyses and algorithms need to be developed on a prover-by-prover basis.

GAPT is a framework for proof theory that is able to represent, process and visualize proofs.
Currently it implements the sequent calculus LK (with or without equality rules) for first and higher
order classical logic, Robinson’s resolution calculus [11], the schematic calculus LKS [4] and expansion
trees [8]. GAPT also provides algorithms for translating proofs between some of these formats, for
cut-elimination (reductive methods à la Gentzen [5] and CERES [2]), and for cut-introduction (proof
compression) [6], as well as an interactive proof visualization tool [3]. But all these tools depend on
having proofs to operate on.

In this work we show how to parse and translate SMT and Connection proofs from veriT and leanCoP,
respectively, into expansion proofs in GAPT. SMT proofs are unsatisfiability proofs with respect to
some theory and, in veriT, these are represented by resolution refutations of a set including
(instances of) the axioms of the theory considered and the negation of the input formula. Connection
proofs decide first-order logic formulas by connecting literals of opposite polarity in the clausal
normal form of the input. These different conceptions of proofs will be unified under the form of
expansion proofs, which can be considered a compact representation of sequent calculus proofs.

The advantages of this work are threefold. First, the use of expansion proofs provides a compact
representation for otherwise big and hard-to-grasp proof objects. Using this representation and GAPT’s
visualization tool, it is easy to see the theorem that was proved and the instances of quantified formulas
used. Second, the use of a common representation facilitates the comparison of proofs and makes
it possible to run and analyse algorithms developed for this representation without the need to adapt
them to different formats. In particular, we have been using the imported proofs to experiment with
proof compression via introduction of cuts [6]. Finally, it provides a simple sanity-check procedure
and the possibility of building LK proofs.

This paper is organized as follows. Section 2 defines basic concepts and extends the usual definition
of expansion trees to accommodate polarities. Section 3 explains how to extract the necessary
information from both formats and how it is then used to build expansion trees. Section 4 presents the
results of the transformation applied to a database of proofs in the considered formats. It also
discusses the advantages of having the proofs as expansion trees. Section 5 discusses some related work
and, finally, Section 6 concludes the paper, pointing to future work.

2 Expansion proofs 

We will work in the setting of first-order classical logic. We now introduce a few basic concepts.

Definition 1 (Polarity in a sequent). Let $S = A_1, \ldots, A_n \vdash B_1, \ldots, B_m$ be a sequent.
We will say that formulas on the left side of $\vdash$, i.e., $A_1, \ldots, A_n$, have negative polarity
while formulas on the right, i.e., $B_1, \ldots, B_m$, have positive polarity.

Definition 2 (Polarity). Let $F$ be a formula and $F'$ a sub-formula of $F$. Then we can define the
polarity of $F'$ in $F$, i.e., $F'$ can be positive or negative in $F$, according to the following criteria:

• If $F = F'$, then $F'$ has the same polarity as $F$.

• If $F = A \wedge B$ or $F = A \vee B$ or $F = \forall x.A$ or $F = \exists x.A$ and $F$ is positive
(negative), then $A$ and $B$ are positive (negative).

• If $F = A \rightarrow B$ and $F$ is positive (negative), then $A$ is negative (positive) and $B$ is
positive (negative).

• If $F = \neg A$ and $F$ is positive (negative), then $A$ is negative (positive).

Throughout this document we will use 0 for negative polarity, 1 for positive polarity and $\bar{p}$ to
denote the opposite polarity of $p$, for $p \in \{0, 1\}$.
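
The polarity rules of Definition 2 can be sketched as a short recursive traversal. The tuple encoding of formulas and the names `flip` and `polarities` are illustrative assumptions made for this sketch; they are not GAPT's data structures.

```python
# Illustrative encoding (an assumption of this sketch, not GAPT's API):
#   "P(a)"                     an atom, kept as an opaque string
#   ("not", A)                 negation
#   ("and" | "or" | "imp", A, B)
#   ("all" | "ex", "x", A)     quantifiers

def flip(p):
    """Return the opposite polarity (0 <-> 1)."""
    return 1 - p

def polarities(f, p):
    """Yield (sub-formula, polarity) pairs for f occurring with polarity p."""
    yield f, p
    if isinstance(f, str):                 # atoms have no sub-formulas
        return
    op = f[0]
    if op == "not":
        yield from polarities(f[1], flip(p))
    elif op in ("and", "or"):
        yield from polarities(f[1], p)
        yield from polarities(f[2], p)
    elif op == "imp":                      # only the antecedent flips
        yield from polarities(f[1], flip(p))
        yield from polarities(f[2], p)
    elif op in ("all", "ex"):
        yield from polarities(f[2], p)
    else:
        raise ValueError(f"unknown connective: {op}")

# (not P) -> (forall x. Q), taken as a positive formula
example = ("imp", ("not", "P"), ("all", "x", "Q"))
result = dict(polarities(example, 1))
```

In this example the antecedent `("not", "P")` is negative, so `P` itself ends up positive again, while `Q` inherits the positive polarity of the whole formula.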

Definition 3 (Strong and weak quantifiers). Let $F$ be a formula. If $\forall x$ occurs positively
(negatively) in $F$, then $\forall x$ is called a strong (weak) quantifier. If $\exists x$ occurs
positively (negatively) in $F$, then $\exists x$ is called a weak (strong) quantifier.

Strong quantifiers in a sequent will be those introduced by the inferences $\forall_r$ and $\exists_l$
in a sequent calculus proof.
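
As a minimal sketch of Definition 3, the function below classifies every quantifier of a sequent as strong or weak. Formulas are encoded as nested tuples, and the names `quantifiers` and `sequent_quantifiers` are hypothetical helpers introduced only for this illustration.

```python
# Formulas (illustrative encoding): atoms are strings, ("not", A),
# ("and" | "or" | "imp", A, B), ("all" | "ex", "x", A).

def quantifiers(f, p):
    """List (quantifier, variable, kind) for each quantifier of f at polarity p."""
    if isinstance(f, str):
        return []
    op = f[0]
    if op == "not":
        return quantifiers(f[1], 1 - p)
    if op in ("and", "or"):
        return quantifiers(f[1], p) + quantifiers(f[2], p)
    if op == "imp":
        return quantifiers(f[1], 1 - p) + quantifiers(f[2], p)
    if op in ("all", "ex"):
        # forall is strong when positive, exists is strong when negative
        strong = (op == "all") == (p == 1)
        kind = "strong" if strong else "weak"
        return [(op, f[1], kind)] + quantifiers(f[2], p)
    raise ValueError(f"unknown connective: {op}")

def sequent_quantifiers(left, right):
    """Formulas on the left are negative (0), on the right positive (1)."""
    out = []
    for f in left:
        out += quantifiers(f, 0)
    for f in right:
        out += quantifiers(f, 1)
    return out

# forall x. P(x)  |-  exists y. forall z. R(y,z)
kinds = sequent_quantifiers(
    [("all", "x", "P(x)")],
    [("ex", "y", ("all", "z", "R(y,z)"))],
)
```

Only the universal quantifier on the right-hand side is strong here, matching the remark that strong quantifiers are the ones introduced by $\forall_r$ and $\exists_l$.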

Expansion proofs are a compact representation for first and higher order sequent calculus proofs.
They can be seen as a generalization of Gentzen’s mid-sequent theorem to formulas which are not
necessarily prenex [8]. Expansion proofs are composed of expansion trees. An expansion tree of a formula
$F$ has this formula as its root. Leaves are atoms occurring in $F$ and inner nodes are connectives or
quantified sub-formulas of $F$. The edges from a quantified node to its children are labelled with the
terms that were used to instantiate the outermost quantifier. We extend the original definition with the
notion of formula polarity and use $\Pi$ and $\Lambda$ for strong and weak quantifiers, respectively, in
expansion trees.

Definition 4 (Expansion tree). Expansion trees and a function $\mathrm{Sh}(E, p)$ (for shallow), that
maps an expansion tree $E$ to a formula with polarity $p \in \{0, 1\}$, are defined inductively as follows:

• If $A$ is an atom, then $A$ is an expansion tree with top node $A$ and $\mathrm{Sh}(A, p) = A$ for any
choice of $p$.

• If $E_0$ is an expansion tree, then $E = \neg E_0$ is an expansion tree with
$\mathrm{Sh}(E, p) = \neg \mathrm{Sh}(E_0, \bar{p})$.

• If $E_1$ and $E_2$ are expansion trees and $\circ \in \{\wedge, \vee\}$, then $E = E_1 \circ E_2$ is an
expansion tree with $\mathrm{Sh}(E, p) = \mathrm{Sh}(E_1, p) \circ \mathrm{Sh}(E_2, p)$.

• If $E_1$ and $E_2$ are expansion trees, then $E = E_1 \rightarrow E_2$ is an expansion tree with
$\mathrm{Sh}(E, p) = \mathrm{Sh}(E_1, \bar{p}) \rightarrow \mathrm{Sh}(E_2, p)$.

• If $\{t_1, \ldots, t_n\}$ is a set of terms and $E_1, \ldots, E_n$ are expansion trees with
$\mathrm{Sh}(E_i, p) = A[x/t_i]$, then $E = \Lambda x.A +^{t_1} E_1 \ldots +^{t_n} E_n$ (denoting a node
with $n$ children) is an expansion tree with $\mathrm{Sh}(E, 0) = \forall x.A$ and
$\mathrm{Sh}(E, 1) = \exists x.A$.

• If $E_0$ is an expansion tree with $\mathrm{Sh}(E_0, p) = A[x/\alpha]$ for an eigenvariable $\alpha$,
then $E = \Pi x.A +^{\alpha} E_0$ is an expansion tree with $\mathrm{Sh}(E, 0) = \exists x.A$ and
$\mathrm{Sh}(E, 1) = \forall x.A$.
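
One possible way to picture Definition 4 in code is shown below. The tuple encoding of expansion trees (with `"weak"` standing for $\Lambda$ nodes and `"strong"` for $\Pi$ nodes) and the function name `shallow` are assumptions of this sketch, not the representation used inside GAPT.

```python
# Expansion trees (illustrative encoding):
#   "P(a)"                                   atom
#   ("not", E), ("and" | "or" | "imp", E1, E2)
#   ("weak", x, A, ((t1, E1), ..., (tn, En)))   Lambda x.A +^t1 E1 ... +^tn En
#   ("strong", x, A, alpha, E0)                  Pi x.A +^alpha E0

def shallow(e, p):
    """Sh(E, p): the formula represented by expansion tree e at polarity p."""
    if isinstance(e, str):
        return e
    op = e[0]
    if op == "not":
        return ("not", shallow(e[1], 1 - p))
    if op in ("and", "or"):
        return (op, shallow(e[1], p), shallow(e[2], p))
    if op == "imp":
        return ("imp", shallow(e[1], 1 - p), shallow(e[2], p))
    if op == "weak":      # weak: forall when negative, exists when positive
        return ("all" if p == 0 else "ex", e[1], e[2])
    if op == "strong":    # strong: exists when negative, forall when positive
        return ("ex" if p == 0 else "all", e[1], e[2])
    raise ValueError(f"unknown node: {op}")

weak_tree = ("weak", "x", "P(x)", (("a", "P(a)"), ("b", "P(b)")))
strong_tree = ("strong", "y", "Q(y)", "u", "Q(u)")
implication = ("imp", weak_tree, strong_tree)
```

For the implication at polarity 1, the antecedent is read at polarity 0, so both quantifier nodes turn into universal quantifiers in the shallow formula.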

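Expansion trees can be mapped to a quantifier-free formula via the deep function, which we also
redefine taking the polarities into account.

Definition 5. We define the function $\mathrm{Dp}(\cdot, p)$ (for deep), $p \in \{0, 1\}$, that maps an
expansion tree to a quantifier-free formula of polarity $p$ as:

$\mathrm{Dp}(A, p) = A$ for an atom $A$

$\mathrm{Dp}(\neg A, p) = \neg \mathrm{Dp}(A, \bar{p})$

$\mathrm{Dp}(A \circ B, p) = \mathrm{Dp}(A, p) \circ \mathrm{Dp}(B, p)$ for $\circ \in \{\wedge, \vee\}$

$\mathrm{Dp}(A \rightarrow B, p) = \mathrm{Dp}(A, \bar{p}) \rightarrow \mathrm{Dp}(B, p)$

$\mathrm{Dp}(\Lambda x.A +^{t_1} E_1 \ldots +^{t_n} E_n, 0) = \bigwedge_{i=1}^{n} \mathrm{Dp}(E_i, 0)$

$\mathrm{Dp}(\Lambda x.A +^{t_1} E_1 \ldots +^{t_n} E_n, 1) = \bigvee_{i=1}^{n} \mathrm{Dp}(E_i, 1)$

$\mathrm{Dp}(\Pi x.A +^{\alpha} E, p) = \mathrm{Dp}(E, p)$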
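
A minimal sketch of the deep function follows, using the same illustrative tuple encoding of expansion trees as an assumption (`"weak"` for $\Lambda$, `"strong"` for $\Pi$). The name `deep` and the choice to fold instances into nested binary connectives are ours.

```python
from functools import reduce

# Expansion trees (illustrative encoding):
#   atoms are strings; ("not", E); ("and" | "or" | "imp", E1, E2);
#   ("weak", x, A, ((t1, E1), ...)); ("strong", x, A, alpha, E0)

def deep(e, p):
    """Dp(E, p): the quantifier-free formula of expansion tree e at polarity p."""
    if isinstance(e, str):
        return e
    op = e[0]
    if op == "not":
        return ("not", deep(e[1], 1 - p))
    if op in ("and", "or"):
        return (op, deep(e[1], p), deep(e[2], p))
    if op == "imp":
        return ("imp", deep(e[1], 1 - p), deep(e[2], p))
    if op == "weak":
        parts = [deep(child, p) for _term, child in e[3]]
        if not parts:
            raise ValueError("a weak node needs at least one instance")
        # instances are conjoined on the left (0) and disjoined on the right (1)
        connective = "and" if p == 0 else "or"
        return reduce(lambda a, b: (connective, a, b), parts)
    if op == "strong":
        return deep(e[4], p)
    raise ValueError(f"unknown node: {op}")

weak_tree = ("weak", "x", "P(x)", (("a", "P(a)"), ("b", "P(b)")))
strong_tree = ("strong", "y", "Q(y)", "u", "Q(u)")
```

The same weak node therefore yields a conjunction of its instances when it occurs negatively and a disjunction when it occurs positively, while a strong node simply passes through its single instance.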


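Definition 6 (Expansion sequent). An expansion sequent $\varepsilon$ is denoted by
$E_1, \ldots, E_n \vdash F_1, \ldots, F_m$ where $E_i$ and $F_i$ are expansion trees. Its deep sequent is
the sequent $\mathrm{Dp}(E_1, 0), \ldots, \mathrm{Dp}(E_n, 0) \vdash \mathrm{Dp}(F_1, 1), \ldots, \mathrm{Dp}(F_m, 1)$
and its shallow sequent is
$\mathrm{Sh}(E_1, 0), \ldots, \mathrm{Sh}(E_n, 0) \vdash \mathrm{Sh}(F_1, 1), \ldots, \mathrm{Sh}(F_m, 1)$.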
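
As a small worked illustration of Definition 6 (our own example, not one of the imported proofs), consider an expansion sequent in which a single weak quantifier on the left is instantiated with two terms $a$ and $b$:

```latex
% Expansion sequent: one weak quantifier on the left, instantiated twice
\varepsilon \;=\; \Lambda x.P(x) +^{a} P(a) +^{b} P(b) \;\vdash\; P(a) \wedge P(b)

% Shallow sequent: the quantified end-sequent
\forall x.P(x) \;\vdash\; P(a) \wedge P(b)

% Deep sequent: the instances on the left (polarity 0) are joined by a conjunction
P(a) \wedge P(b) \;\vdash\; P(a) \wedge P(b)
```

The deep sequent here is propositional, so its validity can be checked without any reference to quantifiers.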

An expansion sequent may or may not represent a proof. To decide whether this is the case, we need 
to reason on the dependency relation in the sequent. 

Definition 7 (Domination). A term $t$ is said to dominate a node $N$ in an expansion tree if it labels a
parent node of $N$.

Definition 8 (Dependency relation). Let $\varepsilon$ be an expansion sequent and let $<^0_\varepsilon$
be the binary relation on the occurrences of terms in $\varepsilon$ defined as: $t <^0_\varepsilon s$ if
there is an $x$ free in $s$ that is an eigenvariable of a node dominated by $t$. Then $<_\varepsilon$,
the transitive closure of $<^0_\varepsilon$, is called the dependency relation of $\varepsilon$.

Definition 9 (Expansion proof). An expansion sequent is considered an expansion proof if its deep 
sequent is a tautology and the dependency relation is acyclic. 

Intuitively, the dependency relation gives an ordering of quantifier inferences in a sequent calculus
proof of the shallow sequent of $\varepsilon$. That is, $t <_\varepsilon s$ means that the existential
quantifiers instantiated with $t$ must occur lower in the proof than those instantiated with $s$. Using
this relation it is possible to build an LK proof from an expansion proof [8].
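
The acyclicity condition of Definition 9 can be sketched as follows. This is a simplified illustration under stated assumptions: expansion trees use an illustrative tuple encoding, terms are plain strings whose free variables are found by tokenizing, and eigenvariable names are assumed not to clash with function symbols. All function names are hypothetical.

```python
import re

# Expansion trees (illustrative encoding): atoms are strings; ("not", E);
# ("and" | "or" | "imp", E1, E2); ("weak", x, A, ((t1, E1), ...));
# ("strong", x, A, alpha, E0)

def subtrees(e):
    """Immediate expansion subtrees of node e."""
    if isinstance(e, str):
        return []
    if e[0] == "weak":
        return [child for _term, child in e[3]]
    if e[0] == "strong":
        return [e[4]]
    return list(e[1:])

def eigenvariables(e):
    """Eigenvariables of all strong nodes in e."""
    found = {e[3]} if not isinstance(e, str) and e[0] == "strong" else set()
    for child in subtrees(e):
        found |= eigenvariables(child)
    return found

def weak_edges(e):
    """Yield (term, subtree) for every weak-quantifier instance in e."""
    if not isinstance(e, str) and e[0] == "weak":
        yield from e[3]
    for child in subtrees(e):
        yield from weak_edges(child)

def dependency_edges(trees):
    """Pairs (i, j) of term occurrences with t_i <0 t_j."""
    occ = [pair for tree in trees for pair in weak_edges(tree)]
    edges = set()
    for i, (_t, sub) in enumerate(occ):
        below = eigenvariables(sub)          # eigenvariables dominated by t_i
        for j, (s, _sub) in enumerate(occ):
            if below & set(re.findall(r"\w+", s)):
                edges.add((i, j))
    return edges

def is_acyclic(edges):
    """Kahn-style check: the transitive closure is acyclic iff the graph is."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    for _i, j in edges:
        indegree[j] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for i, j in edges:
            if i == n:
                indegree[j] -= 1
                if indegree[j] == 0:
                    ready.append(j)
    return removed == len(nodes)

# forall x. exists y. R(x,y) |- exists z. R(a,z), with x := a, y := u, z := u
left = ("weak", "x", ("ex", "y", "R(x,y)"),
        (("a", ("strong", "y", "R(a,y)", "u", "R(a,u)")),))
right = ("weak", "z", "R(a,z)", (("u", "R(a,u)"),))
good = dependency_edges([left, right])

# Instantiating x with the eigenvariable introduced below it creates a cycle
bad_left = ("weak", "x", ("ex", "y", "R(x,y)"),
            (("u", ("strong", "y", "R(u,y)", "u", "R(u,u)")),))
bad = dependency_edges([bad_left])
```

In the first sequent the instance `a` must be introduced below the instance `u`, which gives a single edge and no cycle; in the second the term `u` depends on an eigenvariable that it dominates, so the relation is cyclic and the sequent is not an expansion proof.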

3 Importing 

GAPT¹ is a framework for proof transformations implemented in the programming language Scala.
It supports different proof formats, such as LK (with or without equality) for first and higher order
logic, Robinson’s resolution calculus [11], the schematic calculus LKS [4] and, more recently, expansion
trees. It provides various algorithms for proofs, such as reductive cut-elimination [5], cut-elimination by
resolution [2], cut-introduction [6], Skolemization, and translations between the proof formats. GAPT
also comes with prooftool [3], an interactive proof visualization tool supporting all these formats.

VeriT and leanCoP are automated theorem provers that produce unsatisfiability proofs (in the shape of a
resolution refutation) and connection proofs, respectively. Both output the proof objects to a structured
text file, having in common the fact that all inferences are listed with the operands and the conclusion.
We have implemented parsers (using Scala’s parser combinators) for both formats in GAPT. By taking
the necessary information from each proof file and processing it accordingly, we can build expansion
proofs. We explain the kind of processing needed for each format in Sections 3.1 and 3.2.

¹ https://github.com/gapt/gapt

The expansion tree of a formula with associated substitutions to its bound variables can be defined
as follows:

Definition 10. Let $F$ be a formula in which all bound variables have pairwise distinct names, $\Sigma$ a
set of substitutions for these variables and $p \in \{0, 1\}$ a polarity. Assume that each strong
quantifier in $F$ is bound to exactly one term in $\Sigma$. We define the function
$\mathrm{ET}(F, \Sigma, p)$ that translates a formula to an expansion tree as follows:

• $\mathrm{ET}(A, \Sigma, p) = A$, where $A$ is an atom.

• $\mathrm{ET}(\neg A, \Sigma, p) = \neg \mathrm{ET}(A, \Sigma, \bar{p})$.

• $\mathrm{ET}(A \circ B, \Sigma, p) = \mathrm{ET}(A, \Sigma, p) \circ \mathrm{ET}(B, \Sigma, p)$, for
$\circ \in \{\wedge, \vee\}$.

• $\mathrm{ET}(A \rightarrow B, \Sigma, p) = \mathrm{ET}(A, \Sigma, \bar{p}) \rightarrow \mathrm{ET}(B, \Sigma, p)$.

• $\mathrm{ET}(\forall x.A, \Sigma, 0) = \Lambda x.A +^{t_1} \mathrm{ET}(A\sigma_1, \{\sigma_1\}, 0) \ldots +^{t_n} \mathrm{ET}(A\sigma_n, \{\sigma_n\}, 0)$,
where $\sigma_i$ is the substitution in $\Sigma$ mapping $x$ to $t_i$ ($n$ is the number of times the weak
quantifier was instantiated).

• $\mathrm{ET}(\forall x.A, \Sigma, 1) = \Pi x.A +^{\alpha} \mathrm{ET}(A\sigma', \{\sigma'\}, 1)$, where
$\sigma'$ is the substitution in $\Sigma$ mapping $x$ to $\alpha$.

• $\mathrm{ET}(\exists x.A, \Sigma, 0) = \Pi x.A +^{\alpha} \mathrm{ET}(A\sigma', \{\sigma'\}, 0)$, where
$\sigma'$ is the substitution in $\Sigma$ mapping $x$ to $\alpha$.

• $\mathrm{ET}(\exists x.A, \Sigma, 1) = \Lambda x.A +^{t_1} \mathrm{ET}(A\sigma_1, \{\sigma_1\}, 1) \ldots +^{t_n} \mathrm{ET}(A\sigma_n, \{\sigma_n\}, 1)$,
where $\sigma_i$ is the substitution in $\Sigma$ mapping $x$ to $t_i$ ($n$ is the number of times the weak
quantifier was instantiated).

Note that the term $\alpha$ used for the strong quantifiers is determined by the substitution set
$\Sigma$. If the eigenvariable condition is not satisfied in these substitutions, then the resulting
expansion tree will not be a proof of the formula.
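
The translation of Definition 10 might be sketched as below. Several simplifications are assumptions of this sketch rather than part of the definition: formulas are nested tuples with atoms as strings, substitution is a textual whole-word replacement inside atoms, each substitution is a Python dict, and substitutions that agree on the instantiated variable are grouped and passed down together. The names `subst` and `et` are hypothetical.

```python
import re

# Formulas (illustrative encoding): atoms are strings; ("not", A);
# ("and" | "or" | "imp", A, B); ("all" | "ex", "x", A).
# Result trees: ("weak", x, A, ((t1, E1), ...)) and ("strong", x, A, alpha, E0).

def subst(f, x, t):
    """Replace variable x by term t in f (textual whole-word replacement)."""
    if isinstance(f, str):
        return re.sub(rf"\b{re.escape(x)}\b", lambda _m: t, f)
    if f[0] in ("all", "ex"):          # bound names are pairwise distinct
        return (f[0], f[1], subst(f[2], x, t))
    return (f[0],) + tuple(subst(g, x, t) for g in f[1:])

def et(f, sigmas, p):
    """ET(F, Sigma, p) for a list of substitutions (dicts variable -> term)."""
    if isinstance(f, str):
        return f
    op = f[0]
    if op == "not":
        return ("not", et(f[1], sigmas, 1 - p))
    if op in ("and", "or"):
        return (op, et(f[1], sigmas, p), et(f[2], sigmas, p))
    if op == "imp":
        return ("imp", et(f[1], sigmas, 1 - p), et(f[2], sigmas, p))
    x, body = f[1], f[2]
    terms = []                          # distinct terms used for x, in order
    for s in sigmas:
        if x in s and s[x] not in terms:
            terms.append(s[x])
    weak = (op == "all") == (p == 0)    # forall/negative or exists/positive
    if weak:
        children = tuple(
            (t, et(subst(body, x, t), [s for s in sigmas if s.get(x) == t], p))
            for t in terms
        )
        return ("weak", x, body, children)
    if len(terms) != 1:
        raise ValueError(f"strong variable {x} needs exactly one term")
    alpha = terms[0]
    return ("strong", x, body, alpha, et(subst(body, x, alpha), sigmas, p))

# forall x. exists y. R(x,y) on the left, instantiated twice
formula = ("all", "x", ("ex", "y", "R(x,y)"))
sigma = [{"x": "a", "y": "u"}, {"x": "b", "y": "v"}]
tree = et(formula, sigma, 0)
```

Each branch of the weak node receives only the substitutions that chose its term, so the strong quantifier below it sees exactly one eigenvariable per branch (`u` under `a`, `v` under `b`).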

Using the $\mathrm{ET}(F, \sigma, p)$ transformation, it is also possible to define the expansion
sequent $\varepsilon$ from a sequent $S$.

Definition 11. Let $S : A_1, \ldots, A_n \vdash B_1, \ldots, B_m$ be a sequent with pairwise distinct
bound variables and $\sigma$ a set of substitutions for those variables such that each strongly
quantified variable is bound to exactly one term. Then we define $\mathrm{ET}(S, \sigma)$ as the expansion
sequent $\mathrm{ET}(A_1, \sigma, 0), \ldots, \mathrm{ET}(A_n, \sigma, 0) \vdash \mathrm{ET}(B_1, \sigma, 1), \ldots, \mathrm{ET}(B_m, \sigma, 1)$.

Definitions 10 and 11 show how to build an expansion sequent from a sequent and a set of
substitutions. The requirement of pairwise distinct variables can be easily satisfied by variable
renaming. The second requirement, that each variable of a strong quantifier is bound only once, might
not be true for arbitrary proofs. Fortunately, it holds for the proofs we are dealing with, either
because the input problem contains no strong quantifiers, or because the end-sequent is skolemized. In
the second case, it is possible to deduce unique eigenvariables for each strong quantifier and obtain
the expansion tree of the un-skolemized formula.

Lemma 1. $\mathrm{Sh}(\mathrm{ET}(F, \sigma, p), p) = F$

Proof. Follows from the definition of $\mathrm{ET}(F, \sigma, p)$ and $\mathrm{Sh}(F, p)$. □

Theorem 1. A sequent $S$ with substitutions $\sigma$, such that each strongly quantified variable in $S$
is bound exactly once, is valid iff the expansion sequent $\mathrm{ET}(S, \sigma)$ is an expansion proof.

Proof. By the soundness and completeness of expansion sequents [8], we know that an expansion
sequent $\varepsilon$ is an expansion proof iff its shallow sequent is valid. From Lemma 1 we have that
the shallow sequent of $\mathrm{ET}(S, \sigma)$ is $S$. Therefore, $S$ is valid iff
$\mathrm{ET}(S, \sigma)$ is an expansion proof. □

This theorem provides a “sanity check” for the expansion sequents extracted from proof objects. If the
extracted sequent is an expansion proof, we know that, at least, the end-sequent with the given
substitutions is a tautology. Note that this does not check the proof object itself: it does not
validate each inference applied, but only whether the claimed instantiations can actually lead to a proof.
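
The propositional half of this sanity check can be illustrated with a brute-force truth-table test of a deep sequent. This is only a sketch under stated assumptions: formulas are nested tuples, atoms (including equations) are treated as opaque propositional letters, and the exponential enumeration stands in for whatever tautology checker an implementation would actually use.

```python
from itertools import product

# Quantifier-free formulas (illustrative encoding): atoms are strings;
# ("not", A); ("and" | "or" | "imp", A, B).

def atoms(f):
    """Set of atoms occurring in f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))

def holds(f, valuation):
    """Truth value of f under a dict mapping atoms to booleans."""
    if isinstance(f, str):
        return valuation[f]
    op = f[0]
    if op == "not":
        return not holds(f[1], valuation)
    if op == "and":
        return holds(f[1], valuation) and holds(f[2], valuation)
    if op == "or":
        return holds(f[1], valuation) or holds(f[2], valuation)
    if op == "imp":
        return (not holds(f[1], valuation)) or holds(f[2], valuation)
    raise ValueError(f"unknown connective: {op}")

def is_tautology(left, right):
    """A sequent is valid iff no valuation makes all of left true and all of right false."""
    names = sorted(set().union(*(atoms(f) for f in left + right)))
    for values in product([False, True], repeat=len(names)):
        valuation = dict(zip(names, values))
        if all(holds(f, valuation) for f in left) and not any(
            holds(f, valuation) for f in right
        ):
            return False   # found a countermodel
    return True

valid = is_tautology(["P(a)", ("imp", "P(a)", "Q(a)")], ["Q(a)"])
invalid = is_tautology(["P(a)"], ["Q(a)"])
refutation = is_tautology(["P", ("not", "P")], [])
```

The last call mirrors the shape of an unsatisfiability proof: all formulas sit on the left of the sequent and the right-hand side is empty.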

3.1 SMT proofs 

SMT (Satisfiability Modulo Theory) is a decision procedure for first-order formulas with respect to a
background theory. It can be seen as a generalization of SAT problems. VeriT² is an open-source SMT
solver which is complete for quantifier-free formulas with uninterpreted functions and difference logic
on reals and integers. For this work we have used the proof objects produced by veriT on the QF_UF
(quantifier-free formulas with uninterpreted function symbols) problems of the SMT-LIB³. The
background theory in this case was the equality theory composed of the following axioms (symmetry and
reflexivity are implicit):

$\forall x_0 \ldots \forall x_n.(x_0 = x_1 \wedge \ldots \wedge x_{n-1} = x_n \rightarrow x_0 = x_n)$

$\forall x_0 \ldots \forall x_n \forall y_0 \ldots \forall y_n.(x_0 = y_0 \wedge \ldots \wedge x_n = y_n \rightarrow f(x_0, \ldots, x_n) = f(y_0, \ldots, y_n))$

$\forall x_0 \ldots \forall x_n \forall y_0 \ldots \forall y_n.(x_0 = y_0 \wedge \ldots \wedge x_n = y_n \wedge p(x_0, \ldots, x_n) \rightarrow p(y_0, \ldots, y_n))$

The proofs generated are composed of CNF transformations and a resolution refutation, whose leaves 
are either one of the quantifier-free formulas from the input problem or an instance of an equality axiom. 
The proof object consists of a comprehensive list of labelled clauses used in the resolution proof and 
their origin. They are either an input clause, without ancestors, or the result of an inference rule on other 
clauses, which is specified via the labels. VeriT’s proof is purely propositional and no substitutions are
involved, since the axioms are quantifier-free and contain no free variables.

The input problem is propositional, therefore the only substitutions needed were the ones
instantiating the (weak) quantifiers of the equality axioms⁴. These are found by collecting the ground
instances of these axioms occurring in the leaves of the resolution proof and using a first-order
matching algorithm. By matching the instances with the appropriate axiom (without the quantifiers), we
can obtain the substitutions for the quantified variables. Given those substitutions and the quantified
axioms, we can build the expansion trees. It is worth noting that the quantified equality axioms (i.e.,
transitivity, symmetry, reflexivity, etc.) are built internally in GAPT, since these are not part of the
proof object. Also, the reflexivity instances needed are computed separately, since these are implicit
in veriT. The expansion tree of the (propositional) input formula can be built with an empty set of
substitutions. Since these are unsatisfiability proofs, all expansion trees will be on the left side of
the expansion sequent.
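
The matching step described above can be sketched with a textbook one-way matching procedure. The nested-tuple encoding of terms and formulas, the function name `match` and the example instance are assumptions made for illustration; this is not the matching code used in GAPT.

```python
# Terms and formulas (illustrative encoding): a string is a variable or a
# constant; a tuple ("f", arg1, ..., argn) is a symbol applied to arguments,
# where the symbol may be a function, a predicate, "=" or a connective.

def match(pattern, target, variables, sub=None):
    """Return a substitution s with pattern*s == target, or None if none exists.

    Only the names in `variables` may be instantiated."""
    sub = dict(sub or {})
    if isinstance(pattern, str):
        if pattern in variables:
            if pattern in sub and sub[pattern] != target:
                return None            # variable already bound to another term
            sub[pattern] = target
            return sub
        return sub if pattern == target else None
    if (isinstance(target, str) or len(pattern) != len(target)
            or pattern[0] != target[0]):
        return None                    # different symbol or arity
    for p_arg, t_arg in zip(pattern[1:], target[1:]):
        sub = match(p_arg, t_arg, variables, sub)
        if sub is None:
            return None
    return sub

# Quantifier-free body of a transitivity axiom: x0 = x1 and x1 = x2 -> x0 = x2
axiom = ("imp",
         ("and", ("=", "x0", "x1"), ("=", "x1", "x2")),
         ("=", "x0", "x2"))

# A ground instance as it might occur in a leaf of the refutation (made up here)
instance = ("imp",
            ("and", ("=", "a", ("f", "b")), ("=", ("f", "b"), "c")),
            ("=", "a", "c"))

sigma = match(axiom, instance, {"x0", "x1", "x2"})
no_match = match(axiom,
                 ("imp", ("and", ("=", "a", "b"), ("=", "c", "d")), ("=", "a", "d")),
                 {"x0", "x1", "x2"})
```

The substitution found for each instance provides the terms that label the weak quantifier edges of the axiom's expansion tree.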

3.2 Connection proofs 

Connection calculi are a set of formalisms for deciding first-order classical formulas which consist in
connecting unifiable literals of opposite polarities from the input. Proof search in these calculi is
characterized as goal-oriented and, in general, non-confluent. LeanCoP⁵ is a connection-based theorem
prover that implements a series of techniques for reducing the search space and making proof search
feasible [10]. Although its strategy is incomplete, it achieves very good performance in practice. For
this work, leanCoP 2.2 was used. It can be obtained from the CASC-24 competition website⁶ or,
alternatively, executed online at SystemOnTPTP⁷.

² http://www.verit-solver.org/
³ http://smt-lib.org/
⁴ Observe that we do not need any information from the inference steps.
⁵ http://leancop.de/

Given an input problem (a set of axioms and conjectures in the language of first-order logic), leanCoP
will negate the axioms, skolemize the formulas and translate them into a disjunctive normal form (DNF).
It works with a positive representation of the problem and uses a special DNF transformation that is more
suitable for connection proof search [10]. The prover also adds equality axioms when necessary.
LeanCoP is able to produce proof objects in four different formats. For this work, we have used leantptp,
which is closer to the TPTP (Thousands of Problems for Theorem Provers) specification [12]. The output
file is divided into three parts: (1) input formulas; (2) clauses generated from the DNF transformation
of the input and equality axioms; and (3) proof description. Each part is described using a set of
predicates with the relevant information.

In part (1), the formulas from the input file are listed and named. Their variables are renamed
such that they are pairwise distinct. Moreover, formulas are annotated with respect to their role, e.g.,
axiom or conjecture. Part (2) contains the clauses, in the form of lists of literals, that resulted from
the disjunctive normal form transformation. This can either be the regular naive DNF translation or a
definitional clausal form transformation, which assigns new predicates to some formulas. Each clause
is numbered and associated with the name of the formula that generated it. Equality axioms are labelled
with a special keyword, since they do not come from any transformation on the input formulas. The
proof per se is in part (3), where each line is an inference rule. It contains the number of the clause to
which the inference was applied, the bindings used (if any) and the resulting clause.

For building the expansion trees of the input formulas we need the substitutions used in the proof and
the Skolem terms introduced during Skolemization. The substitutions will be the terms of the expansion
tree’s weak quantifiers and the Skolem terms, translated to variables, will be the expansion tree’s strong
quantifier terms. In the leanCoP proofs, Skolem terms have a specific syntax, so they can be identified
and parsed as “eigenvariables”. We use this approach to get an expansion proof of the original problem,
instead of the skolemized problem. Since each strong quantifier is replaced by exactly one Skolem term,
the condition for the set of substitutions in Definition 10 is satisfied.

The collection of terms used for the weak quantifiers is a bit more involved due to variable renaming. 
The quantified variables in the input formula are renamed during the clausal normal form transformation. 
This means that the sets of variables occurring in the original problem and in the clauses are disjoint. 
The substitutions used in the proof are given with respect to the clauses’ variables, but we are interested 
in building expansion trees of the input formulas. We need therefore to find a way to map the variables 
in the clauses to the variables in the input formulas. 

The solution found was to implement in GAPT the definitional clausal form transformation, trying 
to remain as faithful as possible to the one leanCoP uses, but without the variable renaming. After 
applying our transformation to the input formulas, we try to match the clauses obtained to the clauses 
from the proof object. The first-order matching algorithm returns a substitution if a match is found. 
This substitution maps strongly quantified variables to “eigenvariables” (the result of parsing Skolem
terms), and weakly quantified variables to their renamed versions used in the clauses. By composing this
substitution with the ones obtained from the bindings in the proof, we are able to correctly identify the 
terms used for each quantified variable in the input formulas. 
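
The composition step can be illustrated with the sketch below. It assumes terms are nested tuples, that the matching substitution maps input-formula variables to clause variables or eigenvariables, and that the bindings collected from the proof are acyclic. All variable names in the example (`X`, `_X1`, `sk1` and so on) are made up and do not reflect leanCoP's actual naming scheme.

```python
# Terms (illustrative encoding): a string is a variable or constant,
# a tuple ("f", arg1, ..., argn) is a function application.

def resolve(term, bindings):
    """Apply bindings to term until no bound variable remains.

    Assumes the bindings are acyclic, as produced by a successful proof search."""
    if isinstance(term, str):
        return resolve(bindings[term], bindings) if term in bindings else term
    return (term[0],) + tuple(resolve(arg, bindings) for arg in term[1:])

def input_substitution(matching, bindings):
    """Compose the matching substitution with the bindings from the proof."""
    return {var: resolve(term, bindings) for var, term in matching.items()}

# Hypothetical data: X and Y are weakly quantified in the input formula and were
# renamed to _X1 and _Y1 in the clauses; Z is strongly quantified and was matched
# with the eigenvariable sk1 obtained from a Skolem term.
matching = {"X": "_X1", "Y": "_Y1", "Z": "sk1"}
bindings = {"_X1": ("f", "_Y1"), "_Y1": "a"}

terms = input_substitution(matching, bindings)
```

The resulting map speaks only about variables of the input formulas, which is what is needed to label the quantifier nodes of their expansion trees.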


⁶ http://pages.cs.miami.edu/~tptp/CASC/24/Systems.tgz
⁷ http://pages.cs.miami.edu/~tptp/cgi-bin/SystemOnTPTP

4 Results

We were able to import as expansion trees all the 142 proof objects provided to us by the veriT team,
all but one in under one minute. The expansion sequents generated have been used as input for the
cut-introduction algorithm [6] and some of their features (e.g. a high number of instances) have motivated
improvements to the algorithm. As for leanCoP, our database consists of 3043 proofs of problems from
the TPTP library [12]. Of those, we can successfully import 1224 as expansion sequents. Some errors
still occur while parsing and matching (e.g. our generated clauses do not have the same literal ordering
as the clauses in the proof file), but we are working to increase the success rate.

Getting proofs from various theorem provers in the shape of expansion sequents allows us to do a
number of interesting things. First of all, one can visualize the end-sequent and the instances used of
each quantified formula. This is much more comfortable and easier to grasp than a raw text file. It is
also possible to check whether the instances used lead indeed to a proof of the end-sequent. This is
reduced to checking if the deep sequent of the expansion sequent is a tautology (which can be done, as
this sequent is propositional) and if the dependency relation is acyclic. In case the expansion sequent is
a proof, we can build an LK proof from it, using the dependency relation to decide the order in which
quantifiers are introduced [8]. Finally, one can attempt proof compression and discovery of lemmas using
the cut-introduction algorithm [6].

All of these functionalities are implemented in GAPT. The system comes with an interactive command
line where commands for loading proofs, opening prooftool, introducing cuts, eliminating cuts,
building an LK proof from an expansion sequent, among others, can be issued. Some examples of proofs
imported and their visualizations can be found at https://www.logic.at/staff/giselle/examples.pdf

5 Related Work 

Other projects and tools also address the issues of proof visualization and checking. For proofs in the
TPTP language in particular, there is IDV [13], which provides an interactive interface for manipulating
the DAG representing a derivation. This tool focuses solely on visualization of proofs in the TPTP
format. Our work aims at a more general framework, of which visualization is only a small part. We are
also able to import different proof objects, not only those in the TPTP language.

As for proof checking, [7] proposes a check of leanCoP proofs in HOL Light while [1] shows how
to check SAT and SMT proofs using Coq. The former paper involved re-implementing leanCoP’s kernel
in HOL Light, which differs a lot from our approach of simply parsing the outputs of theorem provers.

In the latter, proofs produced by SAT/SMT theorem provers are certified by Coq. We must clarify that,
given the information needed to produce expansion proofs, it is not fair to claim we are checking proof
objects; we merely have a sanity check that the instances used by the theorem prover actually lead to
a proof of the proposed theorem. Such a compromise makes sense if we want a framework general enough
to deal with different proof objects, without asking for any change on the side of theorem provers.

Finally, it is worth mentioning ProofCert [9], a research project with the aim of developing a
theoretical framework for proof representation. In order not to make such a compromise, and actually
check each step of each proof for various different proof objects, a solid foundation of proof
specification needs to be developed. Until this happens, this work shows how it is still possible to
combine existing proof objects into one representation.

6 Conclusion 

We have shown how SMT and Connection proofs can both be imported as expansion sequents. The
information needed from the proof objects is just the end-sequent being proven and a set of instances
used for the quantified formulas. For both cases presented we relied on a first-order matching algorithm,
but this requirement can be lifted if all substitutions are provided directly in the proof object.

The representation using expansion sequents serves various purposes. It provides an easy proof 
visualization, a simple checking procedure, LK proof construction and introduction of cuts. 

This is ongoing work, and we hope to have many developments in the near future. In particular,
the difficulties in importing leanCoP proofs remain to be resolved. This procedure also offers a lot of
room for optimization. Once we have a big enough set of parsed leanCoP proofs, we will add those to the
benchmark used for the cut-introduction algorithm. As for veriT proofs, we plan to test bigger examples,
as the ones provided are only a small subset of the SMT-LIB.

Another future goal is importing other formats from other provers and comparing the different proofs
for the same input problem. We also aim at integrating, in the import function, a check for whether the
obtained expansion sequent is an expansion proof.